Project-Team:ROMA

Inria | Raweb 2019 | Presentation of the Project-Team ROMA | ROMA Web Site



Section: New Results

High performance tensor–vector multiplication on shared-memory systems

Tensor–vector multiplication is one of the core components of tensor computations. We recently investigated high-performance, single-core implementations of this bandwidth-bound operation. Here, we investigate efficient shared-memory implementations of it. After carefully analyzing the design space, we implement a number of alternatives using OpenMP and compare them experimentally. Experimental results on systems with up to 8 sockets show near-peak performance for the proposed algorithms.
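To make the operation concrete, here is a minimal sketch of a mode-k tensor–vector multiplication parallelized with OpenMP. It is an illustration only, not one of the algorithms evaluated in this work. The function name `tvm_mode`, the dense layout with the last index varying fastest, and the choice to parallelize over the leading ("outer") indices are all assumptions made for this example. The tensor is viewed as a three-dimensional array of shape (outer, n, inner), where n is the size of the contracted mode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the report): computes y = x times v along
 * dimension `mode` of a dense tensor of the given order.
 * Assumed layout: last index varies fastest (row-major).
 * y must hold (product of all dims) / dims[mode] entries. */
void tvm_mode(const double *x, const size_t *dims, size_t order,
              size_t mode, const double *v, double *y)
{
    assert(mode < order);

    /* Collapse the dimensions before and after `mode`. */
    size_t outer = 1, inner = 1;
    for (size_t d = 0; d < mode; ++d) outer *= dims[d];
    for (size_t d = mode + 1; d < order; ++d) inner *= dims[d];
    const size_t n = dims[mode];

    /* Each outer slice writes a disjoint part of y, so the slices can be
     * processed by different threads without synchronization. Without
     * OpenMP enabled, the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for schedule(static)
    for (size_t o = 0; o < outer; ++o) {
        double *yo = y + o * inner;
        const double *xo = x + o * n * inner;

        for (size_t i = 0; i < inner; ++i) yo[i] = 0.0;

        /* Stream over contiguous rows of the slice: every element of x
         * is read exactly once, which is why the operation is
         * bandwidth-bound. */
        for (size_t j = 0; j < n; ++j) {
            const double vj = v[j];
            const double *xj = xo + j * inner;
            for (size_t i = 0; i < inner; ++i) yo[i] += vj * xj[i];
        }
    }
}
```

One limitation of this sketch is that for mode 0 the outer extent is 1, so the loop offers no parallelism; a fuller implementation would also split the inner index range across threads in that case.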

This work appears in the proceedings of PPAM 2019 and is accompanied by a technical report [22], [36].
